02. C/C++ API

Why do we need a C/C++ API?

The PyTorch DQN tutorial provides a good demonstration of how deep reinforcement learning works for problems that take sensor input and produce actions. However, to successfully leverage deep learning technology in robots, we need to move to a library format that can integrate with robots and simulators. In addition, robots require real-time responses to changes in their environments, so computational performance matters. In this section, we introduce an API (application programming interface) in C/C++.

The API provides an interface to the Python code written with PyTorch; the wrappers use Python's low-level C API to pass memory objects between the user's application and Torch without extra copies. Using a compiled language (C/C++) instead of an interpreted one improves performance, and GPU acceleration speeds things up even more.

API stack for Deep RL (from Nvidia [repo](https://github.com/dusty-nv/jetson-reinforcement))

API Repository Sample Environments

The API repository provides instructions and files to build a few different RL agents and environments from source with PyTorch, on either a Jetson or another GPU-equipped x86_64 system. In addition to OpenAI Gym samples similar to those already covered here (Cartpole), the repository contains the following demos:

  • C/C++ 2D Samples
    • Catch (DQN text)
    • Fruit (2D DQN)
  • C/C++ 3D Simulation
    • Robotic Arm (3D DQN in Gazebo)
    • Rover Navigation (3D DQN in Gazebo)

The purpose of building the simple 2D samples is to test and understand the C/C++ API as we move toward the goal of using the API for robotic applications in Gazebo such as the Robotic Arm. Each of these samples will use a DQN agent to solve problems.

The DQN agent

The repo provides an rlAgent base class that can be extended through inheritance to implement agents using various reinforcement learning algorithms. We will focus on the dqnAgent class and apply it to solve DQN reinforcement learning problems.

The following pseudocode illustrates the signature of the dqnAgent class:

class dqnAgent : public rlAgent
{
public:

    /**
     * Create a new DQN agent training instance,
     * the dimensions of a 2D image are expected.
     */
    static dqnAgent* Create( uint32_t width, uint32_t height, uint32_t channels, 
        uint32_t numActions, const char* optimizer = "RMSprop", 
        float learning_rate = 0.001, uint32_t replay_mem = 10000, 
        uint32_t batch_size = 64, float gamma = 0.9, float epsilon_start = 0.9,  
        float epsilon_end = 0.05,  float epsilon_decay = 200,
        bool allow_random = true, bool debug_mode = false);

    /**
     * Destructor
     */
    virtual ~dqnAgent();

    /**
     * From the input state, predict the next action (inference)
     * This function isn't used during training, for that see NextReward()
     */
    virtual bool NextAction( Tensor* state, int* action );

    /**
     * Next action with reward (training)
     */
    virtual bool NextReward( float reward, bool end_episode );
};

Setting up the agent

The agent is instantiated by the Create() function with the appropriate initial parameters. For each iteration of the algorithm, the environment provides sensor data, or environmental state, to the NextAction() call, which returns the agent's action to be applied to the robot or simulation. The environment's reward is issued to the NextReward() function, which kicks off the next training iteration that ensures the agent learns over time.
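
The loop above maps directly onto the calls in the dqnAgent class. What follows is a minimal sketch of that loop, not the repo's actual sample code: the Tensor::Alloc() call and the cpuPtr member used to fill in pixel values are assumptions modeled on the repo's 2D samples, and getEnvironmentState(), applyAction(), and computeReward() are hypothetical placeholders for your environment's own functions.

#include "dqnAgent.h"

// Hypothetical environment hooks -- stand-ins for a real simulator or robot
static void  getEnvironmentState( float* pixels, uint32_t width, uint32_t height ) { /* fill pixels with the latest image */ }
static void  applyAction( int action )                                             { /* send the action to the robot */ }
static float computeReward( bool* end_episode )                                    { *end_episode = false; return 0.0f; }

int main()
{
    // Create a DQN agent for 64x64 grayscale input and 3 discrete actions
    dqnAgent* agent = dqnAgent::Create(64, 64, 1, 3);

    if( !agent )
        return 1;

    // Allocate the input state (Tensor::Alloc() and cpuPtr are assumed from the repo's 2D samples)
    Tensor* state = Tensor::Alloc(64, 64);

    for( int episode=0; episode < 100; episode++ )
    {
        bool end_episode = false;

        while( !end_episode )
        {
            // Fill the state tensor with the latest sensor data
            getEnvironmentState(state->cpuPtr, 64, 64);

            // Ask the agent for its next action
            int action = 0;
            if( !agent->NextAction(state, &action) )
                return 1;

            // Apply the action and score the result
            applyAction(action);
            const float reward = computeReward(&end_episode);

            // Issue the reward, kicking off the next training iteration
            agent->NextReward(reward, end_episode);
        }
    }

    delete agent;
    return 0;
}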

Let's take a detailed look at some of the parameters that can be set up in the Create() function.

Example Parameter Options:

// Define DQN API settings
#define GAME_WIDTH   64             // Set an environment width 
#define GAME_HEIGHT  64             // Set an environment height 
#define NUM_CHANNELS 1              // Set the image channels 
#define OPTIMIZER "RMSprop"         // Set an optimizer
#define LEARNING_RATE 0.01f         // Set an optimizer learning rate
#define REPLAY_MEMORY 10000         // Set the replay memory capacity
#define BATCH_SIZE 32               // Set a training batch size
#define GAMMA 0.9f                  // Set a discount factor
#define EPS_START 0.9f              // Set a starting epsilon-greedy value
#define EPS_END 0.05f               // Set an ending epsilon-greedy value
#define EPS_DECAY 200               // Set an epsilon-greedy decay rate
#define USE_LSTM true               // Add memory (LSTM) to network
#define LSTM_SIZE 256               // Define LSTM size
#define ALLOW_RANDOM true           // Allow RL agent to make random choices
#define DEBUG_DQN false             // Turn on or off DQN debug mode
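
With these settings defined, the agent can be created in a single call. The sketch below is illustrative only, assuming the defines above are in scope: NUM_ACTIONS is a hypothetical define (its value depends on your environment), and USE_LSTM / LSTM_SIZE are consumed by the repo's full Create() signature, which the abbreviated declaration shown earlier omits.

#define NUM_ACTIONS 3               // Hypothetical: number of discrete actions in your environment

// Create the DQN agent using the settings defined above
dqnAgent* agent = dqnAgent::Create( GAME_WIDTH, GAME_HEIGHT, NUM_CHANNELS, NUM_ACTIONS,
                                    OPTIMIZER, LEARNING_RATE, REPLAY_MEMORY, BATCH_SIZE,
                                    GAMMA, EPS_START, EPS_END, EPS_DECAY,
                                    ALLOW_RANDOM, DEBUG_DQN );

if( !agent )
    printf("failed to create DQN agent\n");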

API Knowledge Check

What advantages does the C/C++ API provide for robotics RL problems? (Check all that apply)

SOLUTION:
  • Leverages the power of GPU acceleration to speed up RL agent training
  • Provides popular RL algorithms in library form that can be integrated with robots and simulators
  • Increases execution performance by using compiled C/C++ instead of interpreted Python scripts